Goto

Collaborating Authors

 computer science education


Understanding Student Interaction with AI-Powered Next-Step Hints: Strategies and Challenges

arXiv.org Artificial Intelligence

Automated feedback generation plays a crucial role in enhancing personalized learning experiences in computer science education. Among different types of feedback, next-step hint feedback is particularly important, as it provides students with actionable steps to progress towards solving programming tasks. This study investigates how students interact with an AI-driven next-step hint system in an in-IDE learning environment. We gathered and analyzed a dataset from 34 students solving Kotlin tasks, containing detailed hint interaction logs. We applied process mining techniques and identified 16 common interaction scenarios. Semi-structured interviews with 6 students revealed strategies for managing unhelpful hints, such as adapting partial hints or modifying code to generate variations of the same hint. These findings, combined with our publicly available dataset, offer valuable opportunities for future research and provide key insights into student behavior, helping improve hint design for enhanced learning support.


Exploring Student Choice and the Use of Multimodal Generative AI in Programming Learning

arXiv.org Artificial Intelligence

The broad adoption of Generative AI (GenAI) is impacting Computer Science education, and recent studies found its benefits and potential concerns when students use it for programming learning. However, most existing explorations focus on GenAI tools that primarily support text-to-text interaction. With recent developments, GenAI applications have begun supporting multiple modes of communication, known as multimodality. In this work, we explored how undergraduate programming novices choose and work with multimodal GenAI tools, and their criteria for choices. We selected a commercially available multimodal GenAI platform for interaction, as it supports multiple input and output modalities, including text, audio, image upload, and real-time screen-sharing. Through 16 think-aloud sessions that combined participant observation with follow-up semi-structured interviews, we investigated student modality choices for GenAI tools when completing programming problems and the underlying criteria for modality selections. With multimodal communication emerging as the future of AI in education, this work aims to spark continued exploration on understanding student interaction with multimodal GenAI in the context of CS education.


Comparative Analysis of STEM and non-STEM Teachers' Needs for Integrating AI into Educational Environments

arXiv.org Artificial Intelligence

There is an increasing imperative to integrate programming platforms within AI frameworks to enhance educational tasks for both teachers and students. However, commonly used platforms such as Code.org, Scratch, and Snap fall short of providing the desired AI features and lack adaptability for interdisciplinary applications. This study explores how educational platforms can be improved by incorporating AI and analytics features to create more effective learning environments across various subjects and domains. We interviewed 8 K-12 teachers and asked their practices and needs while using any block-based programming (BBP) platform in their classes. We asked for their approaches in assessment, course development and expansion of resources, and student monitoring in their classes. Thematic analysis of the interview transcripts revealed both commonalities and differences in the AI tools needed between the STEM and non-STEM groups. Our results indicated advanced AI features that could promote BBP platforms. Both groups stressed the need for integrity and plagiarism checks, AI adaptability, customized rubrics, and detailed feedback in assessments. Non-STEM teachers also emphasized the importance of creative assignments and qualitative assessments. Regarding resource development, both AI tools desired for updating curricula, tutoring libraries, and generative AI features. Non-STEM teachers were particularly interested in supporting creative endeavors, such as art simulations. For student monitoring, both groups prioritized desktop control, daily tracking, behavior monitoring, and distraction prevention tools. Our findings identify specific AI-enhanced features needed by K-12 teachers across various disciplines and lay the foundation for creating more efficient, personalized, and engaging educational experiences.


Synthesizing High-Quality Programming Tasks with LLM-based Expert and Student Agents

arXiv.org Artificial Intelligence

Generative AI is transforming computing education by enabling the automatic generation of personalized content and feedback. We investigate its capabilities in providing high-quality programming tasks to students. Despite promising advancements in task generation, a quality gap remains between AI-generated and expert-created tasks. The AI-generated tasks may not align with target programming concepts, could be incomprehensible to students, or may contain critical issues such as incorrect tests. Existing works often require interventions from human teachers for validation. We address these challenges by introducing PyTaskSyn, a novel synthesis technique that first generates a programming task and then decides whether it meets certain quality criteria to be given to students. The key idea is to break this process into multiple stages performed by expert and student agents simulated using both strong and weaker generative models. Through extensive evaluation, we show that PyTaskSyn significantly improves task quality compared to baseline techniques and showcases the importance of each specialized agent type in our validation pipeline. Additionally, we conducted user studies using our publicly available web application and show that PyTaskSyn can deliver high-quality programming tasks comparable to expert-designed ones while reducing workload and costs, and being more engaging than programming tasks that are available in online resources.


Insights from Interviews with Teachers and Students on the Use of a Social Robot in Computer Science Class in Sixth Grade

arXiv.org Artificial Intelligence

-- In this paper we report on first insights from interviews with teachers and students on using social robots in computer science class in sixth grade. Our focus is on learning about requirements and potential applications. We are particularly interested in getting both perspectives, the teachers' and the learners' view on how robots could be used and what features they should or should not have. Results show that teachers as well as students are very open to robots in the classroom. However, requirements are partially quite heterogeneous among the groups. This leads to complex design challenges which we discuss at the end of this paper . I. INTRODUCTION Robots have diverse applications across domains such as healthcare, industry, and education.


Teaching Introduction to Programming in the times of AI: A case study of a course re-design

arXiv.org Artificial Intelligence

The integration of AI tools into programming education has become increasingly prevalent in recent years, transforming the way programming is taught and learned. This paper provides a review of the state - of - the - art AI tools available for teaching and learn ing programming, particularly in the context of introductory courses. It highlights the challenges on course design, learning objectives, course delivery and formative and summative assessment, as well as the misuse of such tools by the students. We discus s ways of re - designing an existing course, re - shaping assignments and pedagogy to address the current AI technologies challenges. This example can serve as a guideline for policies for institutions and teachers involved in teaching programming, aiming to m aximize the benefits of AI tools while addressing the associated challenges and concerns.


A Review of Generative AI in Computer Science Education: Challenges and Opportunities in Accuracy, Authenticity, and Assessment

arXiv.org Artificial Intelligence

This paper surveys the use of Generative AI tools, such as ChatGPT and Claude, in computer science education, focusing on key aspects of accuracy, authenticity, and assessment. Through a literature review, we highlight both the challenges and opportunities these AI tools present. While Generative AI improves efficiency and supports creative student work, it raises concerns such as AI hallucinations, error propagation, bias, and blurred lines between AI-assisted and student-authored content. Human oversight is crucial for addressing these concerns. Existing literature recommends adopting hybrid assessment models that combine AI with human evaluation, developing bias detection frameworks, and promoting AI literacy for both students and educators. Our findings suggest that the successful integration of AI requires a balanced approach, considering ethical, pedagogical, and technical factors. Future research may explore enhancing AI accuracy, preserving academic integrity, and developing adaptive models that balance creativity with precision.


Narrowing the Gap: Supervised Fine-Tuning of Open-Source LLMs as a Viable Alternative to Proprietary Models for Pedagogical Tools

arXiv.org Artificial Intelligence

Frontier Large language models (LLMs) like ChatGPT and Gemini can decipher cryptic compiler errors for novice programmers, but their computational scale, cost, and tendency to over-assist make them problematic for widespread pedagogical adoption. This work demonstrates that smaller, specialised language models, enhanced via Supervised Fine-Tuning (SFT), present a more viable alternative for educational tools. We utilise a new dataset of 40,000 C compiler error explanations, derived from real introductory programming (CS1/2) student-generated programming errors, which we used to fine-tune three open-source models: Qwen3-4B, Llama-3.1-8B, and Qwen3-32B. We performed a dual evaluation, combining expert human reviews with a large-scale automated analysis of 8,000 responses using a validated LLM-as-judge ensemble. Our results show that SFT significantly boosts the pedagogical quality of smaller models, achieving performance comparable to much larger models. We analyse the trade-offs between model size and quality, confirming that fine-tuning compact, efficient models on high-quality, domain-specific data is a potent strategy for creating specialised models to drive educational tools. We provide a replicable methodology to foster broader access to generative AI capabilities in educational contexts.


Automated Identification of Logical Errors in Programs: Advancing Scalable Analysis of Student Misconceptions

arXiv.org Artificial Intelligence

In Computer Science (CS) education, understanding factors contributing to students' programming difficulties is crucial for effective learning support. By identifying specific issues students face, educators can provide targeted assistance to help them overcome obstacles and improve learning outcomes. While identifying sources of struggle, such as misconceptions, in real-time can be challenging in current educational practices, analyzing logical errors in students' code can offer valuable insights. This paper presents a scalable framework for automatically detecting logical errors in students' programming solutions. Our framework is based on an explainable Abstract Syntax Tree (AST) embedding model, the Subtree-based Attention Neural Network (SANN), that identifies the structural components of programs containing logical errors. We conducted a series of experiments to evaluate its effectiveness, and the results suggest that our framework can accurately capture students' logical errors and, more importantly, provide us with deeper insights into their learning processes, offering a valuable tool for enhancing programming education.


Can We Build AI That Does Not Harm Queer People?

Communications of the ACM

AI safety is a contentious topic. While some prominent figures of the AI community have argued that destructive general artificial intelligence (AI) is on the horizon, others derided their warning as a marketing stunt to sell large language models (LLMs). "If the call for'AI safety' is couched in terms of protecting humanity from rogue AIs, it very conveniently displaces accountability away from the corporations scaling harm in the name of profits," tweeted Emily Bender, a professor of computational linguistics at the University of Washington. Focusing on potential future harm from ever more powerful AI systems distracts from harm that is already happening today. Most of us do not set out to make software that is actively harmful.